75 research outputs found
SegNetr: Rethinking the local-global interactions and skip connections in U-shaped networks
Recently, U-shaped networks have dominated the field of medical image
segmentation due to their simple and easily tuned structure. However, existing
U-shaped segmentation networks: 1) mostly focus on designing complex
self-attention modules to compensate for the lack of long-term dependence based
on convolution operation, which increases the overall number of parameters and
computational complexity of the network; 2) simply fuse the features of encoder
and decoder, ignoring the connection between their spatial locations. In this
paper, we rethink the above problem and build a lightweight medical image
segmentation network, called SegNetr. Specifically, we introduce a novel
SegNetr block that can perform local-global interactions dynamically at any
stage and with only linear complexity. At the same time, we design a general
information retention skip connection (IRSC) to preserve the spatial location
information of encoder features and achieve accurate fusion with the decoder
features. We validate the effectiveness of SegNetr on four mainstream medical
image segmentation datasets, with 59% and 76% fewer parameters and GFLOPs
than vanilla U-Net, while achieving segmentation performance comparable to
state-of-the-art methods. Notably, the components proposed in this paper can be
applied to other U-shaped networks to improve their segmentation performance
DiffSeer: Difference-based Dynamic Weighted Graph Visualization
Existing dynamic weighted graph visualization approaches rely on users'
mental comparison to perceive temporal evolution of dynamic weighted graphs,
hindering users from effectively analyzing changes across multiple timeslices.
We propose DiffSeer, a novel approach for dynamic weighted graph visualization
by explicitly visualizing the differences of graph structures (e.g., edge
weight differences) between adjacent timeslices. Specifically, we present a
novel nested matrix design that overviews the graph structure differences over
a time period as well as shows graph structure details in the timeslices of
user interest. By collectively considering the overall temporal evolution and
structure details in each timeslice, an optimization-based node reordering
strategy is developed to group nodes with similar evolution patterns and
highlight interesting graph structure details in each timeslice. We conducted
two case studies on real-world graph datasets and in-depth interviews with 12
target users to evaluate DiffSeer. The results demonstrate its effectiveness in
visualizing dynamic weighted graphs
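As a rough illustration of the quantity the abstract describes (edge weight differences between adjacent timeslices), the following is a minimal Python sketch. It is not DiffSeer's implementation: the function name `edge_weight_differences`, the dict-based graph representation, and the choice to treat a missing edge as weight 0 are all assumptions made for this example.

```python
def edge_weight_differences(timeslices):
    """Return per-edge weight changes between each pair of adjacent timeslices.

    Each timeslice is a dict mapping a (u, v) edge to its weight. A missing
    edge is treated as weight 0.0 (an assumption of this sketch).
    """
    diffs = []
    for prev, curr in zip(timeslices, timeslices[1:]):
        edges = set(prev) | set(curr)
        delta = {e: curr.get(e, 0.0) - prev.get(e, 0.0) for e in edges}
        # Keep only the edges whose weight actually changed.
        diffs.append({e: d for e, d in delta.items() if d != 0.0})
    return diffs


# Three hypothetical timeslices of a small weighted graph.
slices = [
    {("a", "b"): 1.0, ("b", "c"): 2.0},
    {("a", "b"): 3.0, ("b", "c"): 2.0},
    {("a", "b"): 3.0, ("a", "c"): 0.5},
]
diffs = edge_weight_differences(slices)
```

Each entry of `diffs` could then feed one cell group of a difference matrix; how DiffSeer lays these out and reorders nodes is described only at a high level in the abstract and is not modeled here.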
Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP
In the realm of expressive Text-to-Speech (TTS), explicit prosodic boundaries
significantly advance the naturalness and controllability of synthesized
speech. While human prosody annotation contributes a lot to the performance, it
is a labor-intensive and time-consuming process, often resulting in
inconsistent outcomes. Despite the availability of extensive supervised data,
the current benchmark model still faces performance setbacks. To address this
issue, we propose a two-stage automatic annotation pipeline in this
paper. Specifically, in the first stage, we propose contrastive text-speech
pretraining of Speech-Silence and Word-Punctuation (SSWP) pairs. The
pretraining procedure aims to enhance the prosodic space extracted from
joint text-speech space. In the second stage, we build a multi-modal prosody
annotator, which consists of pretrained encoders, a straightforward yet
effective text-speech feature fusion scheme, and a sequence classifier.
Extensive experiments conclusively demonstrate that our proposed method excels
at automatically generating prosody annotation and achieves state-of-the-art
(SOTA) performance. Furthermore, our novel model has exhibited remarkable
resilience when tested with varying amounts of data.
Comment: Submitted to ICASSP 202
TranssionADD: A multi-frame reinforcement based sequence tagging model for audio deepfake detection
Thanks to recent advancements in end-to-end speech modeling technology, it
has become increasingly feasible to imitate and clone a user's voice. This
leads to a significant challenge in differentiating between authentic and
fabricated audio segments. To address the issue of user voice abuse and misuse,
the second Audio Deepfake Detection Challenge (ADD 2023) aims to detect and
analyze deepfake speech utterances. Specifically, Track 2, named the
Manipulation Region Location (RL), aims to pinpoint the location of manipulated
regions in audio, which can be present in both real and generated audio
segments. We propose our novel TranssionADD system as a solution to the
challenging problem of model robustness and audio segment outliers in the trace
competition. Our system provides three unique contributions: 1) we adapt the
sequence tagging task for audio deepfake detection; 2) we improve model
generalization through various data augmentation techniques; 3) we incorporate a
multi-frame detection (MFD) module to overcome the limited representation provided
by a single frame and use an isolated-frame penalty (IFP) loss to handle outliers
in segments. Our best submission achieved 2nd place in Track 2, demonstrating
the effectiveness and robustness of our proposed system
Loop closure detection of visual SLAM based on variational autoencoder
Loop closure detection is an important module for simultaneous localization and mapping (SLAM). Correct detection of loops can reduce the cumulative drift in positioning. Because traditional detection methods rely on handcrafted features, false positive detections can occur when the environment changes, resulting in incorrect estimates and an inability to obtain accurate maps. In this research paper, a loop closure detection method based on a variational autoencoder (VAE) is proposed. The VAE serves as a feature extractor, replacing the handcrafted features used in traditional methods with image features learned by a neural network. The method extracts a low-dimensional vector as the representation of the image. An attention mechanism is added to the network, and constraints are added to the loss function, to obtain a better image representation. In the back-end feature matching process, geometric checking is used to filter out wrong matches and so reduce false positives. Finally, numerical experiments demonstrate that the proposed method has a better precision-recall curve than the traditional bag-of-words model and other deep learning methods and is highly robust to environmental changes. In addition, experiments on datasets from three different scenarios demonstrate that the method can be applied in real-world scenarios and that it performs well
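To make the matching step concrete, here is a minimal Python sketch of how low-dimensional image descriptors might be compared to propose loop closure candidates. It is an illustration under stated assumptions, not the paper's method: the cosine-similarity metric, the `threshold` and `min_gap` parameters, and the function names are hypothetical, the descriptors are hand-written stand-ins for VAE outputs, and the geometric check mentioned in the abstract is not included.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def loop_candidates(descriptors, threshold=0.9, min_gap=2):
    """Return (query_index, match_index, similarity) triples for likely loops.

    A frame is only compared with frames at least `min_gap` steps earlier,
    so that consecutive, naturally similar frames are not reported as loops.
    Both parameters are assumptions of this sketch.
    """
    candidates = []
    for i, query in enumerate(descriptors):
        for j in range(i - min_gap + 1):  # all j with j <= i - min_gap
            sim = cosine_similarity(query, descriptors[j])
            if sim >= threshold:
                candidates.append((i, j, sim))
    return candidates


# Hypothetical 2-D descriptors; frame 3 points almost the same way as frame 0.
descriptors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.99, 0.1]]
candidates = loop_candidates(descriptors)
candidate_pairs = [(i, j) for i, j, _ in candidates]
```

In a full pipeline, each candidate pair would still need a verification stage such as the geometric checking the abstract mentions before being accepted as a loop closure.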
Total saponins from Trillium tschonoskii Maxim promote neurological recovery in model rats with post-stroke cognitive impairment
Total saponins from Trillium tschonoskii Maxim (TSTT), a bioactive component of local natural herbs in the Enshi area, China, have been demonstrated to have functions of restoring cognitive capacity and promoting axonal regeneration post-stroke, but the mechanism of this process remains unclear. The hippocampus is a critical tissue for controlling learning and memory capacity, and the sonic hedgehog (Shh) signaling pathway plays a major role in the patterning and synaptic plasticity of hippocampal neural circuits. Therefore, we aimed to investigate whether TSTT could restore learning and cognitive functions by modulating the Shh pathway in rats with post-stroke cognitive impairment (PSCI). The ischemia model was established by permanent middle cerebral artery occlusion (MCAO) in 100 Sprague–Dawley (SD) rats, and the model rats were administered TSTT (100 mg/kg) or donepezil hydrochloride (DON, 0.45 mg/kg daily) as the positive control for 4 weeks after the operation. As assessed by the Morris water maze test, the cognitive function of PSCI rats was significantly improved upon TSTT treatment. Meanwhile, the cerebral infarct volume was reduced with TSTT, as shown by HE and TTC staining, and the number of Nissl bodies and dendritic spine density were significantly increased, as shown by Nissl and Golgi staining. In addition, TSTT upregulated PSD-95, SYN, and GAP-43, and inhibited neuronal apoptosis, as evidenced by increased Bcl-2 levels along with decreased Bax and caspase-3 expression. TSTT could also significantly upregulate Shh, Ptch1, Smo, and Gli1 proteins, indicating the activation of the Shh signaling pathway. Therefore, TSTT can protect PSCI rats by inhibiting apoptosis and promoting neuronal synaptic remodeling. The Shh pathway is also involved
Extremely thin perfect absorber by generalized multipole bianisotropic effect
Symmetry breaking plays a crucial role in understanding the fundamental
physics underlying numerous physical phenomena, including the electromagnetic
response in resonators, giving rise to intriguing effects such as directional
light scattering, supercavity lasing, and topologically protected states. In
this work, we demonstrate that adding a small fraction of lossy metal (as low
as in volume), to a lossless dielectric resonator breaks
inversion symmetry thereby lifting its degeneracy, leading to a strong
bianisotropic response. In the case of the metasurface composed of such
resonators, this effect leads to unidirectional perfect absorption while
maintaining nearly perfect reflection from the opposite direction. We have
developed more general Onsager-Casimir relations for the polarizabilities of
particle arrays, taking into account the contributions of quadrupoles, which
shows that bianisotropy is not solely due to dipoles, but also involves
high-order multipoles. Our experimental validation demonstrates an extremely
thin terahertz-perfect absorber with a wavelength-to-thickness ratio of up to
25,000, where the material thickness is only 2% of the theoretical minimum
thickness dictated by the fundamental limit. Our findings have significant
implications for a variety of applications, including energy harvesting,
thermal management, single-photon detection, and low-power directional
emission
An Hα Imaging Survey of the Low-surface-brightness Galaxies Selected from the Fall Sky Region of the 40% ALFALFA H I Survey
We present the observed Hα flux and derived star formation rates
(SFRs) for a fall sample of low-surface-brightness galaxies (LSBGs). The
sample is selected from the fall sky region of the 40% ALFALFA H I
survey–SDSS DR7 photometric data, and all the Hα images were
obtained using the 2.16 m telescope, operated by the National Astronomy
Observatories, Chinese Academy of Sciences. A total of 111 LSBGs were observed
and Hα flux was measured in 92 of them. Though almost all the LSBGs in
our sample are H I-rich, their SFRs, derived from the extinction and
filter-transmission-corrected Hα flux, are less than
1 M⊙ yr⁻¹.
LSBGs and star-forming galaxies have similar H I surface densities,
but LSBGs have much lower SFRs and SFR surface densities than star-forming
galaxies. Our results show that LSBGs deviate from the Kennicutt-Schmidt law
significantly, which indicates that they have low star formation efficiency. The
SFRs of LSBGs are close to average SFRs in Hubble time and support the previous
arguments that most of the LSBGs are stable systems and they tend to seldom
contain strong interactions or major mergers during their star formation
histories
Using Lymphocyte and Plasma Hsp70 as Biomarkers for Assessing Coke Oven Exposure among Steel Workers
- …